Results 1 - 20 of 14,597
1.
Epigenetics ; 19(1): 2333660, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38564759

ABSTRACT

DNA methylation (DNAm) plays a crucial role in a number of complex diseases. However, the reliability of DNAm levels measured using Illumina arrays varies across different probes. Previous research primarily assessed probe reliability by comparing duplicate samples between the 450k-450k or 450k-EPIC platforms, with limited investigations on Illumina EPIC v1.0 arrays. We conducted a comprehensive assessment of the EPIC v1.0 array probe reliability using 69 blood DNA samples, each measured twice, generated by the Alzheimer's Disease Neuroimaging Initiative study. We observed higher reliability in probes with average methylation beta values of 0.2 to 0.8, and lower reliability in type I probes or those within the promoter and CpG island regions. Importantly, we found that probe reliability has significant implications in the analyses of Epigenome-wide Association Studies (EWAS). Higher reliability is associated with more consistent effect sizes in different studies, the identification of differentially methylated regions (DMRs) and methylation quantitative trait locus (mQTLs), and significant correlations with downstream gene expression. Moreover, blood DNAm measurements obtained from probes with higher reliability are more likely to show concordance with brain DNAm measurements. Our findings, which provide crucial reliability information for probes on the EPIC v1.0 array, will serve as a valuable resource for future DNAm studies.
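
The abstract does not say which statistic was used to score probe reliability from the duplicate measurements. As a minimal sketch, assuming reliability is summarized per probe by a one-way intraclass correlation coefficient (ICC) over duplicate beta values, the calculation could look like this. The function name `icc_oneway` and the example values are hypothetical.

```python
def icc_oneway(pairs):
    """One-way random-effects ICC(1,1) for duplicate measurements.

    pairs: list of (first, second) beta values for one probe,
    one tuple per sample. Assumes exactly two measurements per sample.
    """
    n = len(pairs)
    k = 2  # measurements per sample
    if n < 2:
        raise ValueError("need at least two samples")
    sample_means = [(a + b) / 2 for a, b in pairs]
    grand_mean = sum(sample_means) / n
    # Between-sample mean square
    msb = k * sum((m - grand_mean) ** 2 for m in sample_means) / (n - 1)
    # Within-sample mean square
    msw = sum((a - m) ** 2 + (b - m) ** 2
              for (a, b), m in zip(pairs, sample_means)) / (n * (k - 1))
    if msb + (k - 1) * msw == 0:
        raise ValueError("no variance in the data")
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical duplicate beta values for one probe across three samples
perfect = [(0.2, 0.2), (0.5, 0.5), (0.8, 0.8)]
noisy = [(0.2, 0.8), (0.8, 0.2), (0.5, 0.5)]
print(icc_oneway(perfect), icc_oneway(noisy))
```

A probe whose duplicates agree closely while samples differ from one another scores near 1; a probe whose duplicates disagree as much as different samples do scores near or below 0.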


Subjects
DNA Methylation , Quantitative Trait Loci , Oligonucleotide Array Sequence Analysis/methods , Reproducibility of Results , CpG Islands
2.
BMC Plant Biol ; 24(1): 306, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38644480

ABSTRACT

Linkage maps are essential for genetic mapping of phenotypic traits, map-based gene cloning, and marker-assisted selection in breeding applications. Construction of a high-quality saturated map requires high-quality genotypic data on a large number of molecular markers. Errors in genotyping cannot be completely avoided, no matter what platform is used. When the genotyping error rate reaches a threshold level, it seriously affects the accuracy of the constructed map and the reliability of subsequent genetic studies. In this study, repeated genotyping of two recombinant inbred line (RIL) populations derived from the crosses Yangxiaomai × Zhongyou 9507 and Jingshuang 16 × Bainong 64 was used to investigate the effect of genotyping errors on linkage map construction. Inconsistent data points between the two replicates were regarded as genotyping errors, which were classified into three types. Genotyping errors were treated as missing values to generate the non-erroneous data set. First, linkage maps were constructed using the two replicates as well as the non-erroneous data set. Second, error correction methods implemented in the software packages QTL IciMapping (EC) and Genotype-Corrector (GC) were applied to the two replicates. Linkage maps were then constructed from the corrected genotypes and compared with those from the non-erroneous data set. A simulation study considering different levels of genotyping error was performed to investigate the impact of errors and the accuracy of the error correction methods. Results indicated that map length and marker order differed among the two replicates and the non-erroneous data sets in both RIL populations. For both actual and simulated populations, map length expanded as the error rate increased, and the correlation coefficient between linkage and physical maps decreased. Map quality can be improved by repeated genotyping and error correction algorithms. When it is impossible to genotype the whole mapping population repeatedly, repeated genotyping of 30% of the population is recommended. The EC method had a much lower false positive rate than the GC method under different error rates. This study systematically examined the impact of genotyping errors on linkage analysis, providing potential guidelines for improving the accuracy of linkage maps in the presence of genotyping errors.
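
The replicate comparison described in the abstract (calls that disagree between the two replicates are counted as errors and set to missing) can be sketched as follows. This is an illustration only: the function name `mask_discordant`, the genotype coding ("A"/"B") and the use of `None` for missing values are assumptions, and the paper's three error types are not modeled.

```python
def mask_discordant(rep1, rep2, missing=None):
    """Compare two replicate genotype vectors for one marker.

    Returns the consensus vector, with discordant calls replaced by
    `missing`, and the fraction of discordant calls.
    """
    if len(rep1) != len(rep2):
        raise ValueError("replicates must have the same length")
    if not rep1:
        raise ValueError("replicates must not be empty")
    consensus = [a if a == b else missing for a, b in zip(rep1, rep2)]
    n_discordant = sum(a != b for a, b in zip(rep1, rep2))
    return consensus, n_discordant / len(rep1)


# Hypothetical calls for four RILs at one marker
consensus, error_rate = mask_discordant(["A", "B", "A", "B"],
                                        ["A", "B", "B", "B"])
print(consensus, error_rate)
```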


Subjects
Chromosome Mapping , Genotype , Triticum , Triticum/genetics , Chromosome Mapping/methods , Quantitative Trait Loci , Genetic Linkage , Genotyping Techniques/methods , Oligonucleotide Array Sequence Analysis/methods
3.
Biomed Microdevices ; 26(2): 20, 2024 Mar 02.
Article in English | MEDLINE | ID: mdl-38430318

ABSTRACT

Polymerase chain reaction (PCR) is considered the gold standard for detecting nucleic acids. A simple PCR system is of great significance for medical applications in remote areas, especially in developing countries. Herein, we propose a low-cost self-assembled platform for microchamber PCR. The working principle is to rotate the chamber PCR microfluidic chip between two heaters held at fixed temperatures, which solves the problem of a low temperature variation rate. The system consists of two temperature controllers, a screw slide rail, a chamber array microfluidic chip and self-built software. Such a system can be constructed at a cost of about US$60. Microchamber PCR is completed by rotating the microfluidic chip between the two fixed-temperature heaters. Results demonstrated that the sensitivity of the temperature controller is 0.1 ℃. The relative error of the duration for the microfluidic chip was 0.02 s. Finally, we successfully amplified the target gene of Porphyromonas gingivalis in the chamber PCR microfluidic chip within 35 min and detected its PCR products on site by fluorescence. The chip consists of 3200 cylindrical chambers. The volume of reagent in each chamber is as low as 0.628 nL. This work provides an effective method to reduce the amplification time required for microchamber PCR.


Subjects
Microfluidics , Microfluidics/methods , Temperature , Oligonucleotide Array Sequence Analysis/methods , Polymerase Chain Reaction/methods
4.
Biosens Bioelectron ; 253: 116172, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38460210

ABSTRACT

Simultaneous multiplexed analysis can provide comprehensive information for disease diagnosis. However, current multiplex methods rely on sophisticated barcode technology, which hinders their wider application. In this study, an ultrasimple size encoding method is proposed for multiplex detection using a wedge-shaped microfluidic chip. Driven by negative pressure, microparticles are naturally arranged in distinct stripes within the chip according to their sizes. This size encoding method is highly precise, distinguishing 3-5 sizes of microparticles with an accuracy of up to 99%, even for microparticles with a size difference as small as 0.5 µm. The entire size encoding process is completed in less than 5 min, making it ultrasimple, reliable, and easy to operate. To evaluate the function of this size encoding microfluidic chip, nucleic acid sequences of three commonly co-infecting viruses (complementary DNA sequences of HIV and HCV, and a DNA sequence of HBV) are employed for multiplex detection. Results indicate that all three DNA sequences can be sensitively detected without any cross-interference. This size-encoding microfluidic chip-based multiplex detection method is simple, rapid, and high-resolution, and its successful application in serum samples makes it highly promising for clinical use.


Subjects
Biosensing Techniques , Microfluidic Analytical Techniques , Microfluidics , Base Sequence , Microfluidic Analytical Techniques/methods , Oligonucleotide Array Sequence Analysis/methods
5.
Comput Biol Med ; 170: 108089, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38330824

ABSTRACT

Gene selection is a process of selecting discriminative genes from microarray data that helps to diagnose and classify cancer samples effectively. Swarm intelligence evolution-based gene selection algorithms can never circumvent the problem that the population is prone to local optima in the process of gene selection. To tackle this challenge, previous research has focused primarily on two aspects: mitigating premature convergence to local optima and escaping from local optima. In contrast to these strategies, this paper introduces a novel perspective by adopting reverse thinking, where the issue of local optima is seen as an opportunity rather than an obstacle. Building on this foundation, we propose MOMOGS-PCE, a novel gene selection approach that effectively exploits the advantageous characteristics of populations trapped in local optima to uncover global optimal solutions. Specifically, MOMOGS-PCE employs a novel population initialization strategy, which involves the initialization of multiple populations that explore diverse orientations to foster distinct population characteristics. The subsequent step involved the utilization of an enhanced NSGA-II algorithm to amplify the advantageous characteristics exhibited by the population. Finally, a novel exchange strategy is proposed to facilitate the transfer of characteristics between populations that have reached near maturity in evolution, thereby promoting further population evolution and enhancing the search for more optimal gene subsets. The experimental results demonstrated that MOMOGS-PCE exhibited significant advantages in comprehensive indicators compared with six competitive multi-objective gene selection algorithms. It is confirmed that the "reverse-thinking" approach not only avoids local optima but also leverages it to uncover superior gene subsets for cancer diagnosis.
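
The abstract names NSGA-II but does not list the objectives being optimized. As a rough sketch of the Pareto-dominance test at the core of NSGA-II's non-dominated sorting, assume two objectives that are both minimized: classification error and number of selected genes. The objectives, function names and values below are hypothetical and are not taken from MOMOGS-PCE.

```python
def dominates(a, b):
    """True if solution a is no worse than b on every objective and
    strictly better on at least one (all objectives minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def first_front(points):
    """Return the non-dominated solutions (the first Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (classification error, number of genes) per gene subset
candidates = [(0.10, 5), (0.08, 9), (0.12, 7), (0.08, 12)]
print(first_front(candidates))
```

Here (0.12, 7) is dominated by (0.10, 5), and (0.08, 12) by (0.08, 9), so only two subsets remain on the front.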


Subjects
Algorithms , Neoplasms , Humans , Neoplasms/diagnosis , Neoplasms/genetics , Oligonucleotide Array Sequence Analysis/methods
6.
Nucleic Acids Res ; 52(7): e38, 2024 Apr 24.
Article in English | MEDLINE | ID: mdl-38407446

ABSTRACT

The Infinium BeadChip is the most widely used DNA methylome assay technology for population-scale epigenome profiling. However, the standard workflow requires over 200 ng of input DNA, hindering its application to small cell-number samples, such as primordial germ cells. We developed experimental and analysis workflows to extend this technology to suboptimal input DNA conditions, including ultra-low input down to single cells. DNA preamplification significantly enhanced detection rates to over 50% in five-cell samples and ∼25% in single cells. Enzymatic conversion also substantially improved data quality. Computationally, we developed a method to model the background signal's influence on the DNA methylation level readings. The modified detection P-value calculation achieved higher sensitivities for low-input datasets and was validated in over 100 000 public diverse methylome profiles. We employed the optimized workflow to query the demethylation dynamics in mouse primordial germ cells available at low cell numbers. Our data revealed nuanced chromatin states, sex disparities, and the role of DNA methylation in transposable element regulation during germ cell development. Collectively, we present comprehensive experimental and computational solutions to extend this widely used methylation assay technology to applications with limited DNA.


Subjects
DNA Methylation , Germ Cells , Single-Cell Analysis , Animals , Single-Cell Analysis/methods , Mice , Germ Cells/metabolism , Female , Male , Epigenomics/methods , Humans , Epigenome , Oligonucleotide Array Sequence Analysis/methods , DNA/genetics , DNA/metabolism , Epigenesis, Genetic , CpG Islands
7.
Nat Commun ; 15(1): 1366, 2024 Feb 14.
Article in English | MEDLINE | ID: mdl-38355558

ABSTRACT

Efficient pathogen enrichment and nucleic acid isolation are critical for accurate and sensitive diagnosis of infectious diseases, especially those with low pathogen levels. Our study introduces a biporous silica nanofilms-embedded sample preparation chip for pathogen and nucleic acid enrichment/isolation. This chip features unique biporous nanostructures comprising large and small pore layers. Computational simulations confirm that these nanostructures enhance the surface area and promote the formation of nanovortices, resulting in improved capture efficiency. Notably, the chip demonstrates a 100-fold lower limit of detection compared to conventional methods used for nucleic acid detection. Clinical validations using patient samples corroborate the superior sensitivity of the chip when combined with the luminescence resonance energy transfer assay. The enhanced sample preparation efficiency of the chip, along with the facile and straightforward synthesis of the biporous nanostructures, offers a promising solution for polymerase chain reaction-free detection of nucleic acids.


Subjects
Nanostructures , Nucleic Acids , Humans , Microfluidics , Silicon Dioxide , Oligonucleotide Array Sequence Analysis/methods , Nucleic Acid Amplification Techniques
8.
Comput Biol Chem ; 109: 108009, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38219419

ABSTRACT

Many soft biclustering algorithms have been developed and applied to various biological and biomedical data analyses. However, few mutually exclusive (hard) biclustering algorithms have been proposed, which could better identify disease or molecular subtypes with survival significance based on genomic or transcriptomic data. In this study, we developed a novel mutually exclusive spectral biclustering (MESBC) algorithm based on spectral method to detect mutually exclusive biclusters. MESBC simultaneously detects relevant features (genes) and corresponding conditions (patients) subgroups and, therefore, automatically uses the signature features for each subtype to perform the clustering. Extensive simulations revealed that MESBC provided superior accuracy in detecting pre-specified biclusters compared with the non-negative matrix factorization (NMF) and Dhillon's algorithm, particularly in very noisy data. Further analysis of the algorithm on real datasets obtained from the TCGA database showed that MESBC provided more accurate (i.e., smaller p-value) overall survival prediction in patients with lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) cancers when compared to the existing, gold-standard subtypes for lung cancers (integrative clustering). Furthermore, MESBC detected several genes with significant prognostic value in both LUAD and LUSC patients. External validation on an independent, unseen GEO dataset of LUAD showed that MESBC-derived clusters based on TCGA data still exhibited clear biclustering patterns and consistent, outstanding prognostic predictability, demonstrating robust generalizability of MESBC. Therefore, MESBC could potentially be used as a risk stratification tool to optimize the treatment for the patient, improve the selection of patients for clinical trials, and contribute to the development of novel therapeutic agents.


Subjects
Adenocarcinoma of Lung , Carcinoma, Non-Small-Cell Lung , Carcinoma, Squamous Cell , Lung Neoplasms , Humans , Oligonucleotide Array Sequence Analysis/methods , Gene Expression Profiling/methods , Algorithms , Lung Neoplasms/genetics
9.
J Comput Biol ; 31(1): 71-82, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38010511

ABSTRACT

The analysis of gene expression data has made significant contributions to understanding disease mechanisms and developing new drugs and therapies. In such analysis, gene selection is often required for identifying informative and relevant genes and removing redundant and irrelevant ones. However, this is not an easy task as gene expression data have inherent challenges such as ultra-high dimensionality, biological noise, and measurement errors. This study focuses on the measurement errors in gene selection problems. Typically, high-throughput experiments have their own intrinsic measurement errors, which can result in an increase of falsely discovered genes. To alleviate this problem, this study proposes a gene selection method that takes into account measurement errors using generalized linear measurement error models. The method iterates filtering and selection steps until convergence, leading to fewer false positives and providing stable results under measurement errors. The performance of the proposed method is demonstrated through simulation studies and an application to a lung cancer data set.


Subjects
Gene Expression Profiling , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Computer Simulation
10.
Commun Biol ; 6(1): 1151, 2023 Nov 13.
Article in English | MEDLINE | ID: mdl-37953348

ABSTRACT

The function of regulatory elements is highly dependent on the cellular context, and thus for understanding the function of elements associated with psychiatric diseases these would ideally be studied in neurons in a living brain. Massively Parallel Reporter Assays (MPRAs) are molecular genetic tools that enable functional screening of hundreds of predefined sequences in a single experiment. These assays have not yet been adapted to query specific cell types in vivo in a complex tissue like the mouse brain. Here, using a test-case 3'UTR MPRA library with genomic elements containing variants from autism patients, we developed a method to achieve reproducible measurements of element effects in vivo in a cell type-specific manner, using excitatory cortical neurons and striatal medium spiny neurons as test cases. This targeted technique should enable robust, functional annotation of genetic elements in the cellular contexts most relevant to psychiatric disease.


Subjects
Oligonucleotide Array Sequence Analysis , Regulatory Sequences, Nucleic Acid , Animals , Humans , Mice , Oligonucleotide Array Sequence Analysis/methods , 3' Untranslated Regions , Cerebral Cortex , Medium Spiny Neurons
11.
BMC Bioinformatics ; 24(1): 408, 2023 Oct 30.
Article in English | MEDLINE | ID: mdl-37904108

ABSTRACT

BACKGROUND: Gene-wise differential expression is usually the first major step in the statistical analysis of high-throughput data obtained from techniques such as microarrays or RNA-sequencing. The analysis at gene level is often complemented by interrogating the data in a broader biological context that considers as unit of measure groups of genes that may have a common function or biological trait. Among the vast number of publications about gene set analysis (GSA), the rotation test for gene set analysis, also referred to as roast, is a general sample randomization approach that maintains the integrity of the intra-gene set correlation structure in defining the null distribution of the test. RESULTS: We present roastgsa, an R package that contains several enrichment score functions that feed the roast algorithm for hypothesis testing. These implemented methods are evaluated using both simulated and benchmarking data in microarray and RNA-seq datasets. We find that computationally intensive measures based on Kolmogorov-Smirnov (KS) statistics fail to improve the rates of simpler measures of GSA like mean and maxmean scores. We also show the importance of accounting for the gene linear dependence structure of the testing set, which is linked to the loss of effective signature size. Complete graphical representation of the results, including an approximation for the effective signature size, can be obtained as part of the roastgsa output. CONCLUSIONS: We encourage the usage of the absmean (non-directional), mean (directional) and maxmean (directional) scores for roast GSA analysis as these are simple measures of enrichment that have presented dominant results in all provided analyses in comparison to the more complex KS measures.
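
The three recommended scores (absmean, mean, maxmean) are simple summaries of the gene-level statistics within a set. The sketch below illustrates them in Python under common textbook definitions. It is not the roastgsa implementation (which is an R package), the function names are hypothetical, and returning maxmean with a sign is an assumption made here to keep it directional.

```python
def mean_score(stats):
    """Directional: average of the gene-level statistics."""
    return sum(stats) / len(stats)


def absmean_score(stats):
    """Non-directional: average absolute gene-level statistic."""
    return sum(abs(s) for s in stats) / len(stats)


def maxmean_score(stats):
    """Directional: the larger of the averaged positive parts and the
    averaged negative parts, both divided by the full set size.
    Returned with a sign (an assumption of this sketch)."""
    n = len(stats)
    positive_part = sum(s for s in stats if s > 0) / n
    negative_part = sum(-s for s in stats if s < 0) / n
    return positive_part if positive_part >= negative_part else -negative_part


# Hypothetical moderated t-statistics for a four-gene set
t_stats = [2.0, 1.0, -1.0, 0.0]
print(mean_score(t_stats), absmean_score(t_stats), maxmean_score(t_stats))
```

In a rotation test, the same score would be recomputed on each rotated data set to build the null distribution; that step is omitted here.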


Subjects
Algorithms , Gene Expression Profiling , Gene Expression Profiling/methods , Rotation , Oligonucleotide Array Sequence Analysis/methods , Phenotype
12.
Anal Chem ; 95(41): 15384-15393, 2023 Oct 17.
Article in English | MEDLINE | ID: mdl-37801728

ABSTRACT

Glass is by far the most common substrate for biomolecular arrays, including high-throughput sequencing flow cells and microarrays. The native glass hydroxyl surface is modified by using silane chemistry to provide appropriate functional groups and reactivities for either in situ synthesis or surface immobilization of biologically or chemically synthesized biomolecules. These arrays, typically of oligonucleotides or peptides, are then subjected to long incubation times in warm aqueous buffers prior to fluorescence readout. Under these conditions, the siloxy bonds to the glass are susceptible to hydrolysis, resulting in significant loss of biomolecules and concomitant loss of signal from the assay. Here, we demonstrate that functionalization of glass surfaces with dipodal silanes results in greatly improved stability compared to equivalent functionalization with standard monopodal silanes. Using photolithographic in situ synthesis of DNA, we show that dipodal silanes are compatible with phosphoramidite chemistry and that hybridization performed on the resulting arrays provides greatly improved signal and signal-to-noise ratios compared with surfaces functionalized with monopodal silanes.


Subjects
High-Throughput Screening Assays , Silanes , Oligonucleotide Array Sequence Analysis/methods , Silanes/chemistry , Nucleic Acid Hybridization/methods , DNA/chemistry , Glass/chemistry , Surface Properties
13.
PLoS One ; 18(8): e0289971, 2023.
Article in English | MEDLINE | ID: mdl-37561760

ABSTRACT

As breast cancer is a multistage progression disease resulting from a genetic sequence of mutations, understanding the genes whose expression values increase or decrease monotonically across pathologic stages can provide insightful clues about how breast cancer initiates and advances. Utilizing variational autoencoder (VAE) networks in conjunction with traditional statistical testing, we successfully ascertain long non-coding RNAs (lncRNAs) that exhibit monotonically differential expression values in breast cancer. Subsequently, we validate that the identified lncRNAs really present monotonically changed patterns. The proposed procedure identified 248 monotonically decreasing expressed and 115 increasing expressed lncRNAs. They correspond to a total of 65 and 33 genes respectively, which possess unique known gene symbols. Some of them are associated with breast cancer, as suggested by previous studies. Furthermore, enriched pathways by the target mRNAs of these identified lncRNAs include the Wnt signaling pathway, human papillomavirus (HPV) infection, and Rap 1 signaling pathway, which have been shown to play crucial roles in the initiation and development of breast cancer. Additionally, we trained a VAE model using the entire dataset. To assess the effectiveness of the identified lncRNAs, a microarray dataset was employed as the test set. The results obtained from this evaluation were deemed satisfactory. In conclusion, further experimental validation of these lncRNAs with a large-sized study is warranted, and the proposed procedure is highly recommended.


Subjects
Breast Neoplasms , RNA, Long Noncoding , Humans , Female , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , Breast Neoplasms/genetics , Oligonucleotide Array Sequence Analysis/methods , Wnt Signaling Pathway , RNA, Messenger/metabolism , Gene Expression Profiling
14.
Brief Bioinform ; 24(4), 2023 Jul 20.
Article in English | MEDLINE | ID: mdl-37419612

ABSTRACT

Missing values (MVs) can adversely impact data analysis and machine-learning model development. We propose a novel mixed-model method for missing value imputation (MVI). This method, ProJect (short for Protein inJection), is a powerful and meaningful improvement over existing MVI methods such as Bayesian principal component analysis (PCA), probabilistic PCA, local least squares and quantile regression imputation of left-censored data. We rigorously tested ProJect on various high-throughput data types, including genomics and mass spectrometry (MS)-based proteomics. Specifically, we utilized renal cancer (RC) data acquired using DIA-SWATH, ovarian cancer (OC) data acquired using DIA-MS, and bladder (BladderBatch) and glioblastoma (GBM) microarray gene expression datasets. Our results demonstrate that ProJect consistently performs better than other referenced MVI methods. It achieves the lowest normalized root mean square error (on average, 45.92% less error in RC_C, 27.37% in RC_full, 29.22% in OC, 23.65% in BladderBatch and 20.20% in GBM relative to the closest competing method) and the lowest Procrustes sum of squared error (Procrustes SS) (79.71% less error in RC_C, 38.36% in RC_full, 18.13% in OC, 74.74% in BladderBatch and 30.79% in GBM compared to the next best method). ProJect also leads with the highest correlation coefficient among all types of MV combinations (0.64% higher in RC_C, 0.24% in RC_full, 0.55% in OC, 0.39% in BladderBatch and 0.27% in GBM versus the second-best performing method). ProJect's key strength is its ability to handle different types of MVs commonly found in real-world data. Unlike most MVI methods that are designed to handle only one type of MV, ProJect employs a decision-making algorithm that first determines if an MV is missing at random or missing not at random. It then employs targeted imputation strategies for each MV type, resulting in more accurate and reliable imputation outcomes. An R implementation of ProJect is available at https://github.com/miaomiao6606/ProJect.
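
Several normalizations of the root mean square error are in use, and the abstract does not say which one was applied. A minimal sketch, assuming the error is normalized by the variance of the true values at the masked positions, is shown below. The function name `nrmse` and the example values are hypothetical, and this is not code from the ProJect repository.

```python
import math


def nrmse(true_values, imputed_values):
    """Normalized RMSE over the entries that were masked and imputed.

    Assumes normalization by the (population) variance of the true
    values; other conventions divide by the range or the mean.
    """
    if len(true_values) != len(imputed_values) or not true_values:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(true_values)
    mean_true = sum(true_values) / n
    mse = sum((t - i) ** 2 for t, i in zip(true_values, imputed_values)) / n
    variance = sum((t - mean_true) ** 2 for t in true_values) / n
    if variance == 0:
        raise ValueError("true values have zero variance")
    return math.sqrt(mse / variance)


truth = [1.0, 2.0, 3.0, 4.0]
print(nrmse(truth, truth))                 # perfect imputation
print(nrmse(truth, [2.5, 2.5, 2.5, 2.5]))  # imputing the mean everywhere
```

Under this convention a perfect imputation scores 0, and filling every gap with the mean of the true values scores 1, which gives a natural baseline for comparing methods.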


Subjects
Algorithms , Genomics , Bayes Theorem , Oligonucleotide Array Sequence Analysis/methods , Mass Spectrometry/methods
15.
IEEE/ACM Trans Comput Biol Bioinform ; 20(5): 2802-2809, 2023.
Article in English | MEDLINE | ID: mdl-37285246

ABSTRACT

Biclustering algorithms are essential for processing gene expression data. However, to process the dataset, most biclustering algorithms require preprocessing the data matrix into a binary matrix. Regrettably, this type of preprocessing may introduce noise or cause information loss in the binary matrix, which would reduce the biclustering algorithm's ability to effectively obtain the optimal biclusters. In this paper, we propose a new preprocessing method named Mean-Standard Deviation (MSD) to resolve the problem. Additionally, we introduce a new biclustering algorithm called Weight Adjacency Difference Matrix Binary Biclustering (W-AMBB) to effectively process datasets containing overlapping biclusters. The basic idea is to create a weighted adjacency difference matrix by applying weights to a binary matrix that is derived from the data matrix. This allows us to identify genes with significant associations in sample data by efficiently identifying similar genes that respond to specific conditions. Furthermore, the performance of the W-AMBB algorithm was tested on both synthetic and real datasets and compared with other classical biclustering methods. The experiment results demonstrate that the W-AMBB algorithm is significantly more robust than the compared biclustering methods on the synthetic dataset. Additionally, the results of the GO enrichment analysis show that the W-AMBB method possesses biological significance on real datasets.


Subjects
Algorithms , Gene Expression Profiling , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Cluster Analysis , Gene Expression
16.
Forensic Sci Int Genet ; 65: 102885, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37137205

ABSTRACT

Since the arrest of the Golden State Killer in the US in April 2018, forensic geneticists have been increasingly interested in the investigative genetic genealogy (IGG) method. While this method has already been in practical use as a powerful tool for criminal investigation, we have yet to know well the limitations and potential risks. In this current study, we performed an evaluation study focusing on degraded DNA using the Affymetrix Genome-Wide Human SNP Array 6.0 platform (Thermo Fisher Scientific). We revealed one of the potential problems that occur during SNP genotype determination using a microarray-based platform. Our analysis results indicated that the SNP profiles derived from degraded DNA contained many false heterozygous SNPs. In addition, it was confirmed that the total amount of probe signal intensity on microarray chips derived from degraded DNA decreased significantly. Because the conventional analysis algorithm performs normalization during genotype determination, we concluded that noise signals could be genotype-called. To address this issue, we proposed a novel microarray data analysis method without normalization (nMAP). Although the nMAP algorithm resulted in a low call rate, it substantially improved genotyping accuracy. Finally, we confirmed the usefulness of the nMAP algorithm for kinship inferences. These findings and the nMAP algorithm will make a contribution to the advance of the IGG method.


Subjects
DNA , Immunoglobulin G , Humans , Genotype , Oligonucleotide Array Sequence Analysis/methods , DNA/genetics , Immunoglobulin G/genetics , Polymorphism, Single Nucleotide
17.
Methods Mol Biol ; 2639: 69-81, 2023.
Article in English | MEDLINE | ID: mdl-37166711

ABSTRACT

In biology, molecular cascade signaling is an essential tool to mediate various pathways and downstream behaviors. Mimicking these molecular cascades plays an important role in synthetic biology. The use of DNA self-assembly represents an elegant way to build sophisticated molecular cascades. For instance, a DNA molecular array connected by a number of dynamic anti-junction units was able to realize prescribed, multistep, long-range cascaded transformation. The dynamic DNA molecular array is able to execute transformations with programmable initiation, propagation, and regulation. The transformation of the array can be initiated at selected units and then propagated, without addition of extra triggers, to neighboring units and eventually the entire array.


Subjects
DNA , Nanotechnology , DNA/genetics , Oligonucleotide Array Sequence Analysis/methods , Nanotechnology/methods
18.
Comput Biol Med ; 158: 106854, 2023 May.
Article in English | MEDLINE | ID: mdl-37023541

ABSTRACT

In recent times, microarray gene expression datasets have gained significant popularity due to their usefulness to identify different types of cancer directly through bio-markers. These datasets possess a high gene-to-sample ratio and high dimensionality, with only a few genes functioning as bio-markers. Consequently, a significant amount of data is redundant, and it is essential to filter out important genes carefully. In this paper, we propose the Simulated Annealing aided Genetic Algorithm (SAGA), a meta-heuristic approach to identify informative genes from high-dimensional datasets. SAGA utilizes a two-way mutation-based Simulated Annealing (SA) as well as Genetic Algorithm (GA) to ensure a good trade-off between exploitation and exploration of the search space, respectively. The naive version of GA often gets stuck in a local optimum and depends on the initial population, leading to premature convergence. To address this, we have blended a clustering-based population generation with SA to distribute the initial population of GA over the entire feature space. To further enhance the performance, we reduce the initial search space by a score-based filter approach called the Mutually Informed Correlation Coefficient (MICC). The proposed method is evaluated on 6 microarray and 6 omics datasets. Comparison of SAGA with contemporary algorithms has shown that SAGA performs much better than its peers. Our code is available at https://github.com/shyammarjit/SAGA.
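
Simulated annealing balances exploitation and exploration by sometimes accepting a worse candidate, with a probability that shrinks as the temperature falls. The sketch below shows the standard Metropolis acceptance rule for a minimization problem. It is a generic illustration, not SAGA's two-way mutation scheme, and the function name `accept` and its `u` parameter are hypothetical.

```python
import math
import random


def accept(delta, temperature, u=None):
    """Metropolis acceptance rule for minimization.

    delta: candidate cost minus current cost.
    temperature: current annealing temperature (> 0).
    u: optional uniform(0, 1) draw, exposed so results are reproducible.
    """
    if delta <= 0:
        return True  # an improvement (or tie) is always accepted
    if u is None:
        u = random.random()
    return u < math.exp(-delta / temperature)


# The same worsening move at a high and a low temperature
print(accept(1.0, 1.0, u=0.3))  # exp(-1) is about 0.37, so accepted
print(accept(1.0, 0.1, u=0.3))  # exp(-10) is tiny, so rejected
```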


Subjects
Algorithms , Neoplasms , Humans , Oligonucleotide Array Sequence Analysis/methods , Neoplasms/genetics , Cluster Analysis
19.
BMC Bioinformatics ; 24(1): 130, 2023 Apr 04.
Article in English | MEDLINE | ID: mdl-37016297

ABSTRACT

BACKGROUND: In the field of genomics and personalized medicine, it is a key issue to find biomarkers directly related to the diagnosis of specific diseases from high-throughput gene microarray data. Feature selection technology can discover biomarkers with disease classification information. RESULTS: We use support vector machines as classifiers and use the five-fold cross-validation average classification accuracy, recall, precision and F1 score as evaluation metrics to evaluate the identified biomarkers. Experimental results show classification accuracy above 0.93, recall above 0.92, precision above 0.91, and F1 score above 0.94 on eight microarray datasets. METHOD: This paper proposes a two-stage hybrid biomarker selection method based on ensemble filter and binary differential evolution incorporating binary African vultures optimization (EF-BDBA), which can effectively reduce the dimension of microarray data and obtain optimal biomarkers. In the first stage, we propose an ensemble filter feature selection method. The method combines an improved fast correlation-based filter algorithm with Fisher score. Obviously redundant and irrelevant features can be filtered out to initially reduce the dimensionality of the microarray data. In the second stage, the optimal feature subset is selected using an improved binary differential evolution incorporating an improved binary African vultures optimization algorithm. The African vultures optimization algorithm has excellent global optimization ability. It has not been systematically applied to feature selection problems, especially for gene microarray data. We combine it with a differential evolution algorithm to improve population diversity. CONCLUSION: Compared with traditional feature selection methods and advanced hybrid methods, the proposed method achieves higher classification accuracy and identifies excellent biomarkers while retaining fewer features. The experimental results demonstrate the effectiveness of the proposed algorithmic model and its advance over existing methods.
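
The Fisher score mentioned above as one ingredient of the first-stage filter can be sketched as follows. This shows only the standard Fisher score (between-class scatter divided by within-class scatter, per feature), not the paper's ensemble filter or its improved fast correlation-based filter; the function name `fisher_scores` and the toy data are assumptions for illustration.

```python
def fisher_scores(X, y):
    """Return the standard Fisher score of each feature column in X given class labels y."""
    n_features = len(X[0])
    classes = sorted(set(y))
    scores = []
    for j in range(n_features):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        between = 0.0  # sum over classes of n_k * (mean_k - overall mean)^2
        within = 0.0   # sum over classes of n_k * variance_k (population variance)
        for c in classes:
            values = [v for v, label in zip(column, y) if label == c]
            n_k = len(values)
            mean_k = sum(values) / n_k
            var_k = sum((v - mean_k) ** 2 for v in values) / n_k
            between += n_k * (mean_k - overall_mean) ** 2
            within += n_k * var_k
        if within == 0.0:
            # No spread inside any class: perfectly separating if the class means differ.
            scores.append(float("inf") if between > 0.0 else 0.0)
        else:
            scores.append(between / within)
    return scores


# Hypothetical toy data: feature 0 separates the two classes, feature 1 does not.
X = [[1.0, 5.0], [2.0, 6.0], [9.0, 5.0], [10.0, 6.0]]
y = [0, 0, 1, 1]
scores = fisher_scores(X, y)
```

Features would then be ranked by score and the lowest-ranked ones discarded before the second-stage search; the cut-off is a design choice the abstract does not specify.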


Subjects
Algorithms , Support Vector Machine , Biomarkers , Oligonucleotide Array Sequence Analysis/methods , Benchmarking
20.
BMC Bioinformatics ; 24(1): 150, 2023 Apr 17.
Article in English | MEDLINE | ID: mdl-37069540

ABSTRACT

BACKGROUND: Gene expression profiling is a widely adopted method in areas like drug development or functional gene analysis. Microarray data of gene expression experiments is still commonly used and widely available for retrospective analyses. However, due to changes in the underlying technologies, data sets from different technologies are often difficult to compare, and thus a multitude of already available data becomes difficult to use. We present a web application that abstracts away mathematical and programmatic details in order to enable a convenient and customizable analysis of microarray data for large-scale reproducibility studies. In addition, the web application provides a feature that allows easy access to large microarray repositories. RESULTS: Our web application consists of three basic steps which are necessary for a differential gene expression analysis as well as Gene Ontology (GO) enrichment analysis and the comparison of multiple analysis results. Genealyzer can handle Affymetrix data as well as one-channel and two-channel Agilent data. All steps are visualized with meaningful plots. The application offers flexible analysis while being intuitively operable. CONCLUSIONS: Our web application provides a unified platform for analysing microarray data, while allowing users to compare the results of different technologies and organisms. Beyond reproducibility, this also offers many possibilities for gaining further insights from existing study data, especially since data from different technologies or organisms can also be compared. The web application can be accessed via this URL: https://genealyzer.item.fraunhofer.de/ . Login credentials can be found at the end.


Subjects
Gene Expression Profiling , Software , Oligonucleotide Array Sequence Analysis/methods , Reproducibility of Results , Retrospective Studies , Gene Expression Profiling/methods , Gene Expression , Internet